Device, method and computer program for 3D rendering

ABSTRACT

The present disclosure improves 3D representation (by means of adjusting disparity between pictures) in stereoscopic or autostereoscopic images in order to better fit the natural viewing geometry, by modifying each image of a stereo pair with a geometric transform centered on an intersection point of an imaging plane and a gaze direction for each eye. An observer's view is modeled independently for the left and right eyes, and the left and right views are processed independently with the goal to re-align pixel directions for each eye, making these directions more in correspondence with natural vision for the observer.

FIELD

The present disclosure generally relates to a device, a method and a computer program for 3D rendering.

BACKGROUND

3D rendering is usable in various fields such as 3DTV, 3D displays, 3D games, 3D glasses, and so forth.

Virtual reality and augmented reality particularly face the issue of correspondence between the real and the virtual world and can benefit from the embodiment of the present disclosure as well.

The embodiment of the present disclosure improves 3D representation (by means of adjusting disparity between pictures) in stereoscopic or autostereoscopic images in order to better fit the natural viewing geometry.

Classically, for rendering, pinhole camera models are used which are not in accordance with the way human eyes capture and exploit angles in space for depth perception.

According to the present embodiment, a new model is used taking into account individual characteristics to modify the projective geometry.

Today, the prevalent 3D content creation model is the double perspective projection (or double pinhole camera model). It is widely used for both synthetic and natural content, either as a CGI camera model, or as a representation model for real cameras. It is used as well as a basis for 3D image processing of many sorts.

As shown in FIG. 1, perspective projection predicts x/d=X/D and y/d=Y/D for each of the two eyes.

Binocular observation is classically modeled by a double perspective projection, one corresponding to each eye or camera directed perpendicularly to the eye plane.

With this model, as shown in FIG. 2, the disparity Δx between views for a point to be shown at a distance D and for a display at a distance d is predicted to be

Δx=(e·d)/D  (1)

where “e” is the interocular (or inter-pupillary) distance.
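
For example, with an interocular distance e=6.5 cm, a display at d=2 m and a point intended to appear at D=4 m, formula (1) predicts a disparity Δx=(0.065·2)/4=0.0325 m, i.e. 3.25 cm on the screen.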

Today this basic pinhole camera model is widely used to design cameras, in 3D computer-generated imagery (CGI) or for 3D image or video processing. This generates 3D images and video with scale/depth distortion or incorrect 3D perspective and motion distortion when compared to human vision.

Compared to natural viewing, the pinhole camera model creates images that are more difficult to visualize, although they are accepted as 3D images by the human visual system. That is, the correspondence with the natural space is only partial, valid only for small visual angles around the attention point.

Some solutions have been proposed to attenuate the annoyance (depth scaling/image scaling), such as a solution previously proposed by the applicant giving control of depth strength to the user, or as in the following paper:

“Mapping perceived depth to regions of interest in stereoscopic images”, N. S. Holliman, in Stereoscopic Displays and Virtual Reality Systems XI, Proceedings of SPIE-IS&T Electronic Imaging, SPIE Vol. 5291, 2004 (see http://www.dur.ac.uk/n.s.holliman/Presentations/E15291A-12.pdf).

The objective of the present embodiment is to improve a projection model in order to obtain a better 3D representation and rendering for a better spatial matching between an intended (designed or captured) space and its representation on a 3D display. As a result, the 3D rendered objects will appear more natural to the observers.

According to the present disclosure, an observer's view is modeled independently for the left and right eyes, and the left and right views are processed independently with the goal to re-align pixel directions for each eye, to make these directions more in correspondence with natural vision for the observer.

SUMMARY

According to one aspect of the present disclosure, each image of a stereo pair is modified with a geometric transform centered on an intersection point of an imaging plane (the screen plane where the right and left images are displayed) and a gaze direction for each eye.

Other objects, features and advantages of the present disclosure will become more apparent from the following detailed description when read in conjunction with the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates perspective projection;

FIG. 2 illustrates disparity computed from double perspective projection;

FIG. 3 is a flowchart illustrating a 3D rendering process according to an embodiment of the present disclosure;

FIG. 4 illustrates an OpenGL function gluLookAt( ) used to define left and right positions and orientations in a classical camera model;

FIG. 5 illustrates an on-axis (fixation) point and off-axis points;

FIGS. 6A, 6B and 6C illustrate disparity gradients;

FIG. 7 illustrates internal optical system deviations/scaling;

FIG. 8 illustrates one example of a hardware configuration of a device for 3D rendering according to the embodiment of the present disclosure;

FIG. 9A illustrates a human eye optical system (schematic eye);

FIG. 9B illustrates image formation based on the schematic eye of FIG. 9A; and

FIG. 9C illustrates a reduced eye based on the schematic eye of FIG. 9A.

DESCRIPTION OF EMBODIMENT

A preferred embodiment of the present disclosure will be described with reference to the accompanying drawings.

The embodiment of the present disclosure is applicable to various fields such as 3DTV, 3D displays, 3D games, 3D glasses, and so forth, and relates to enhancement of user experience with a better representation of 3D scenes improving depth perception. It deals with image rendering/adaptation aspects.

As described above, according to the present embodiment, an observer's view is modeled independently for the left and right eyes, and the left and right views are processed independently with the goal to re-align pixel directions for each eye, to make these directions more in correspondence with natural vision for the observer.

The realignment is applied on ‘standard’ 3D observer images or on computer graphics camera models, i.e. the input images were rendered or acquired with algorithms or camera models working with a double perspective projection (double pinhole camera model).

As will be described later with the flowchart of FIG. 3, after a first step, the disparity for the fixation point is adjusted to be at the intended distance (Step S3 in FIG. 3). Then, each view (left and right) is independently processed, correcting first a potential keystoning (and astigmatism) (Step S6) before applying a radial transform (Step S7).

Basically, the radial transform is a central scaling (centered on the visual axis) while higher order coefficients would represent optical distortions.

Then, inverse keystoning re-projects the transformed image back into the screen plane (Step S8).

Processes of keystoning, radial transform and inverse keystoning (Steps S6-S8) can be combined in a single image processing step.
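
As an illustration only, this combination can be sketched as one point mapping. The following C++ fragment is a minimal sketch under simplifying assumptions (not the full method): the keystone projection onto the plane Pk is represented by a 3×3 homography H with inverse Hinv, the radial transform is reduced to its first-order scale factor s, and c is the transform center <<C>> mapped into the Pk plane (i.e. applyH(H, C)):

struct Vec2 { double x, y; };

// Apply a 3x3 homography h (row-major) to a 2D point.
Vec2 applyH(const double h[9], Vec2 p) {
    double w = h[6] * p.x + h[7] * p.y + h[8];
    return { (h[0] * p.x + h[1] * p.y + h[2]) / w,
             (h[3] * p.x + h[4] * p.y + h[5]) / w };
}

// First-order radial transform r' = s*r about the center c
// (higher-order coefficients are omitted in this sketch).
Vec2 radial(Vec2 p, Vec2 c, double s) {
    return { c.x + s * (p.x - c.x), c.y + s * (p.y - c.y) };
}

// Steps S6-S8 as one mapping: keystone (H), radial scaling in the Pk plane,
// inverse keystone (Hinv). For backward mapping during image resampling,
// call with s replaced by 1/s.
Vec2 combinedWarp(const double H[9], const double Hinv[9],
                  Vec2 c, double s, Vec2 p) {
    return applyH(Hinv, radial(applyH(H, p), c, s));
}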

By the present embodiment, advantages can be expected such as a better immersion for the observer in a 3D scene (realness), a better correspondence between an intended (designed or real) space and its representation on a 3D display, and a better quality of experience for the observer due to an enhanced perceived naturalness of the content.

According to the present embodiment, stereo pair views are modified with geometric transforms centered on the point of fixation and depending on parameters determined for a given observer.

The point of fixation will follow an actual eye gaze (given by eye tracking) or follow a predicted eye gaze (content based processing) as in the following paper:

“A coherent computational approach to model bottom-up visual attention”, O. Le Meur, P. Le Callet, D. Barba and D. Thoreau, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 5, pp. 802-817, 2006.

The parameters are determined either by an interactive test procedure or by an adaptive process along with visualization.

The specific configuration of the embodiment will now be described in detail with reference to the drawings.

The physiological parameters of human vision, on which the following model of the embodiment is based, will be described later.

[Transform to be Applied on Images]

According to the embodiment, a spatial (angular domain) transform is applied to 3D presented content, dependent on the fixation point and centered on the visual axes, in order to take into account the various optical and non-optical effects in human vision with the goal to enlarge the field of fused 3D.

FIG. 3 is a flowchart of a 3D rendering process according to the model of the present disclosure embodiment.

Steps S1-S9 are executed for each image pair.

In Step S1, a screen is located in 3D space. That is, a rectangle is located in 3D space where the left and right views are displayed.

In Step S2, the eyes fixation axes are determined. That is, either by eye tracking or by 3D attention modeling, the eyes fixation axes are determined. This determination takes into account the potential optical deviation of prescription glasses.

In Step S3, the disparity of the fixated 3D point is adjusted by image shift so that the perceived distance Dp of the fixated 3D point corresponds to the intended distance Di.

In Step S4, when Dp has become equal to Di in Step S3 (YES), Steps S5-S8 are executed.

When Dp does not become equal to Di (NO), Steps S2-S3 are repeated until Dp becomes equal to Di (YES in Step S4).

Steps S5-S8 are executed for each eye.

In Step S5, the intersection point <<C>> between the fixation axis and the imaging plane is identified.

In Steps S6-S8, the transform process is carried out centered on the intersection point <<C>>. As mentioned above, the three steps S6-S8 can be combined in a single image processing function.

In Step S6, the keystone transform process is carried out. Note that, in case of convergence or lateral viewing, the keystone transform process works in a plane orthogonal to the fixation axis (this transform may account for potential astigmatism).

In Step S7, the radial transform process is carried out with the center <<C>>. As mentioned above, the first order is a central scaling (e.g., linked to accommodation) and the higher orders represent optical distortions.

In Step S8, the inverse keystone process is carried out to re-project the transformed image back into the imaging plane.

In Step S9, the image pairs thus obtained are actually displayed.
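
For illustration only, Steps S2-S4 can be read as the following loop; this is a minimal sketch where estimateFixation( ), perceivedDistance( ) and shiftViews( ) are hypothetical helpers standing in for the eye-tracking/attention-model, triangulation and pixel-shift operations described above:

#include <cmath>

struct FixationAxes { /* left/right fixation axes in space */ };
struct View { /* pixel buffer of one view */ };

// Hypothetical helpers (assumed provided by tracking and rendering layers):
FixationAxes estimateFixation();                          // Step S2
double perceivedDistance(const FixationAxes& axes);       // triangulated Dp
void shiftViews(View& l, View& r, double Dp, double Di);  // Step S3

void adjustFixationDisparity(View& left, View& right, double Di) {
    const double tol = 1e-3;                     // assumed threshold (metres)
    for (;;) {
        FixationAxes axes = estimateFixation();  // Step S2
        double Dp = perceivedDistance(axes);
        if (std::fabs(Dp - Di) < tol) return;    // Step S4: Dp matches Di
        shiftViews(left, right, Dp, Di);         // Step S3
    }
}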

The process of FIG. 3 starts from the identification that 3D perception is based first on monocular retinal images. For each individual observer, binocular fusion is trained/adapted to his/her daily life environment. The left and right monocular images, in order to be fused comfortably to create a 3D percept, have to respect, or be closer to, the geometry of natural life observation, for both still scenes and scenes in motion.

The process of FIG. 3 aims to transform stereoscopic views initially created for a ‘standard’ 3D observer as mentioned above.

[Transform to be Applied on 3D Graphics Cameras]

Another model, based on the same principles, can be derived for a CGI pipeline in the form of a transformed camera model.

In this regard, a classical camera model as defined in the OpenGL language has the following form:

This is an excerpt from “http://www.dgp.toronto.edu/~hertzman/418notes.pdf” (Copyright © 2005 David Fleet and Aaron Hertzmann).

“§ 6.10 Camera Projections in OpenGL

OpenGL's modelview matrix is used to transform a point from object or world space to camera space. In addition to this, a projection matrix is provided to perform the homogeneous perspective transformation from camera coordinates to clip coordinates before performing perspective division. After selecting the projection matrix, the glFrustum function is used to specify a viewing volume, assuming the camera is at the origin:

glMatrixMode (GL_PROJECTION);

glLoadIdentity( );

glFrustum (left, right, bottom, top, near, far);

For orthographic projection, glOrtho can be used instead:

glOrtho (left, right, bottom, top, near, far);

The GLU library provides a function to simplify specifying a perspective projection viewing frustum:

gluPerspective (fieldOfView, aspectRatio, near, far);

The field of view is specified in degrees about the x-axis, so it gives the vertical visible angle. The aspect ratio should usually be the viewport width over its height, to determine the horizontal field of view.”

To realize a transform according to the present embodiment, the viewing frustum is set up with the adequate scaling applied to the fieldOfView parameter of the gluPerspective( ) function. This scaling will realize the first order of the radial function centered on “C”.
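
As a minimal sketch (not an excerpt from the notes quoted above), such a scaling could be applied as follows, assuming a first-order radial scale factor s and hypothetical parameters baseFov (degrees), aspectRatio, zNear and zFar; a central scaling of image radii by s corresponds to dividing tan(fieldOfView/2) by s:

#include <cmath>
#include <GL/glu.h>

const double kPi = 3.14159265358979323846;

// Fold the first-order radial scale factor s into the projection
// by narrowing (s > 1) or widening (s < 1) the vertical field of view.
void setScaledProjection(double baseFov, double s,
                         double aspectRatio, double zNear, double zFar) {
    double scaledFov =
        2.0 * atan(tan(baseFov * kPi / 360.0) / s) * 180.0 / kPi;
    glMatrixMode(GL_PROJECTION);
    glLoadIdentity();
    gluPerspective(scaledFov, aspectRatio, zNear, zFar);
}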

Higher order distortions have to be realized as warping on the resultant image of this modified camera model, for each view.

Stereoscopic acquisition is realized by a function with the camera axes oriented towards the attention point in the 3D CGI scene. The OpenGL function gluLookAt( ) can be used to define the left and right eye position and orientation.

gluLookAt(eye_x, eye_y, eye_z, center_x, center_y, center_z, up_x, up_y, up_z) (see FIG. 4)

The function gluLookAt creates a viewing matrix that allows defining the camera position and looking at a specific point in the scene.
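
For example, a converged stereo setup could be sketched as follows (hypothetical variable names: e is the interocular distance, (ex, ey, ez) a head-centre position, (Ax, Ay, Az) the attention point A):

// Left eye, converged on the attention point A, up vector +y:
gluLookAt(ex - e / 2.0, ey, ez,  Ax, Ay, Az,  0.0, 1.0, 0.0);
// ... render the left view ...
// Right eye, also converged on A:
gluLookAt(ex + e / 2.0, ey, ez,  Ax, Ay, Az,  0.0, 1.0, 0.0);
// ... render the right view ...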

[Test to Derive the Transform Parameters]

The transform according to the present embodiment requires the parameters measured or estimated for a given observer, one by one, as follows. An absolute coordinate system is assumed known, with the origin O and the axes Ox, Oy and Oz.

Screen:

The screen absolute position is defined in space, for example, by the geometric coordinates of three corners of the light emitting surface. Conversion from the space coordinates to pixels is assumed known on the light emitting surface.

Eyes Position:

The coordinates of the centers of rotation of the left and the right eyes of an observer are assumed known, either from an assumed head position and interocular distance or from an eye tracking measurement.

Fixation Axes:

The fixation axes are defined by two points in space for each of the left and right ocular geometries, i.e., an eye side point and a scene side point.

Eye Side Point:

Either the center of rotation of the eye, or a point derived from the center of rotation given known or assumed eye structure and geometry. Such a derived point can be the center of the eye pupil, the eye nodal point, the fovea or foveola, etc.

Scene Side Point:

A measurement or an estimate of the fixation point in space is obtained for each eye; this is potentially not the same point for both eyes, for example in case of squint (strabismus). The measurement can be performed using a gaze tracking system. If a measurement is not available, an estimate can be obtained by predicting the region of interest by analyzing the content presented to the observer.

The fixation axes are two lines in space potentially crossing at the fixation point, corresponding to a specific area of interest of the content displayed on the screen.

Potentially, depending on the observer, these two fixation axes (left eye and right eye) do not cross at all, or do not cross at the intended distance from the observer (according to content design).

Image Shifts:

A physical object is used (e.g. the corner of a diamond shaped card), placed at the intended fixation point in space at a known distance from the observer's eye plane (an acquisition device like a webcam, a Kinect, an ultrasound or a time-of-flight system may help determine the physical distance of the physical object and of the observer's face from the screen plane). Simultaneously, the intended fixation point is rendered on the screen, potentially surrounded with static or dynamic image structures stabilizing the observer's gaze. If the two fixation axes do not cross at this point, the content images are translated (pixel shifted) one relative to the other, horizontally and potentially vertically, for the perceived fixation point to match the intended fixation point in terms of distance and eccentricity.

This shifting process may be done iteratively, reevaluating the fixation axes after each shift step.

In case of significant occlusions, more complex processing than image shifting may be necessary, based on view re-interpolation depending on local disparities.
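
For a rough idea of the magnitude of the required shift, formula (1) can be inverted; the function below is a sketch valid under the double pinhole model only, returning the change of disparity (in screen-plane length units, to be converted to pixels via the known screen mapping) that moves the perceived fixation distance from Dp to the intended Di:

// Based on formula (1): disparity at distance D is e*d/D, so the relative
// shift needed is the difference of the two disparities.
// e: interocular distance, d: screen distance, same length unit throughout.
double disparityShift(double e, double d, double Dp, double Di) {
    return e * d / Di - e * d / Dp;
}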

For Each Eye, Keeping Fixation

Transform Center “C”

The transform center “C” is determined as the intersection of the fixation axis and the imaging plane for the eye under consideration.

Keystone Transform

A plane Pk in space is determined as passing through the fixation point and being perpendicular to the fixation axis for the eye under consideration.

The keystone transform is realized as the projection in space of the image content pixels towards the eye side point and onto the plane Pk (potentially approximated by an orthogonal projection on the plane Pk).

Scale Factor and Distortion:

A physical object is used (e.g. a diamond shaped card) placed with one corner at the intended fixation point, now adjusted. Simultaneously, the same shape as the physical object is rendered on the screen.

The size of the shape is adjusted interactively by the observer using a scaling transformation centered on the fixation point. This is equivalent to a radial scaling with a function r′=s·r, r and r′ being the radial distances to the fixation point in the Pk plane.

The observer may observe the stimuli binocularly or choose to close the eye not under consideration to realize the scaling adjustment.

Distortion can be estimated by matching points at different radial distances on the physical object with points on the rendered shape. The function in this case can be a higher order polynomial of the form: r′=s₁·r+s₂·r²+s₃·r³+ . . .
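
A minimal sketch of evaluating this mapping, assuming the coefficients s₁ . . . sₙ are stored lowest order first in a vector:

#include <vector>

// Evaluate r' = s1*r + s2*r^2 + s3*r^3 + ... using Horner's rule.
// coeffs = {s1, s2, s3, ...}, lowest order first.
double radialPolynomial(double r, const std::vector<double>& coeffs) {
    double rp = 0.0;
    for (auto it = coeffs.rbegin(); it != coeffs.rend(); ++it)
        rp = rp * r + *it;   // builds s1 + s2*r + s3*r^2 + ...
    return rp * r;           // one more factor of r gives s1*r + s2*r^2 + ...
}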

In case of anisometropia or if the person observes the screen from a side, the scaling factor or the polynomial coefficients may be found to differ from one eye to the other.

Anisotropic distortion can as well be estimated in case of astigmatism. Comparing the physical object and the rendered shape, a horizontal and a vertical scaling will be performed.

Dynamic Aspect:

As the parameters vary depending on the fixation point distance and eccentricity (as depending on glasses prismatic deviation, convergence, accommodation or pupil size), the parameters have to be estimated for several fixation point positions in space, an interpolation providing these parameters for intermediate positions.

Resulting from the tests and from the interpolation based on the fixation distance and eccentricity, the 3D model parameters according to the present embodiment are available for all the fixation positions in a volume facing the observer.
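
As an illustration, a parameter such as the scale factor s could be linearly interpolated between two calibrated fixation distances; a full implementation would interpolate over both distance and eccentricity, from the table of values measured in the test procedure above:

// Linear interpolation of a calibrated parameter between two measured
// fixation distances D0 and D1 (with measured values s0 and s1).
double lerpParam(double D, double D0, double s0, double D1, double s1) {
    double t = (D - D0) / (D1 - D0);
    return s0 + t * (s1 - s0);
}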

As mentioned above, the present embodiment is applicable to 3DTV, 3D displays, 3D games, 3D glasses, or any visual immersion application needing an enhancement of depth perception and coherence between intended (real or designed) and displayed 3D scenes.

By adapting the equations to non-planar screen rendering, the model can be applied to curved-surface displays.

[Specific Aspects of Human Vision]

The classical 3D projection model defines a simple, ‘standard’, 3D observer based on perspective projection. However, this is a first order approximation, and human vision does not have the characteristics of basic perspective projection.

Physiological Parameters of Human Vision

Many parameters are sources of differences or deviations with regard to a simple perspective projection model, as follows. These parameters should be taken into account:

- Inter-pupillar distance
- Convergence
- Ocular media index: n′≈4/3
- Accommodation
- Prescription glasses
- Visual system scaling
- . . .

All these differences create a deviation of light rays, i.e. a distortion of light ray angles before reaching the retina. The model where the image formed on the retina is angularly homothetic to a perspective projection of the scene does not accurately represent human vision. Here are some effects explained:

Inter-Pupillar Distance

Inter-pupillar distance sets the scale for binocular vision as the above-mentioned formula (1) shows. This is a primary individual factor and its variability is important (5.2 cm to 7.8 cm according to “Computer Graphics Lecture Notes, CSC418/CSCD18/CSC2504, Computer Science Department, University of Toronto, Version: Nov. 24, 2006”). However, varying the “e” parameter is often the tool used to lower S3D visualization discomfort or fatigue, and one study (see “Rosenberg, L. B. (1993, September). The effect of interocular distance upon operator performance using stereoscopic displays to perform virtual depth tasks. In Virtual Reality Annual International Symposium, 1993 IEEE (pp. 27-32)”) even reports that an “e” value as small as 3 cm (while the average is ˜6.3 cm) does not affect operator performance in a virtual reality task. This shows that humans have a wide range of acceptability for S3D even if strongly physically inappropriate “e” values are used. Acceptability is however masking 3D perception concerns.

Convergence

The eyes converge when presented a proximal stimulus, depending on visual attention (see “Q. Huynh-Thu, M. Barkowsky, P. Le Callet, ‘The importance of visual attention in improving the 3D-TV viewing experience,’ IEEE Trans. Broadcasting, vol. 57, no. 2, pp. 421-430, June 2011”). The direction of the visual axis is then modified for each of the eyes when converging. A first effect is the change of retinal disparities compared to a situation where the two fixation axes would remain parallel (angular shift of each view according to convergence). Every disparity presented to the viewers has to be reassessed according to their convergence state. A second effect of convergence is a reduced inter-pupillar distance, as the pupils get closer when the eyes rotate to converge. The effect of eye point location is analyzed in “Vaissie, L., Rolland, J. P., & Bochenek, G. M. (1999, May). Analysis of eyepoint locations and accuracy of rendered depth in binocular head-mounted displays. In Conference on Stereoscopic Displays and Applications X, Proc. SPIE (Vol. 3639, p. 57)”. Convergence also modifies the optical axis orientation for each eye. As these axes are reference (symmetry) axes for optical transforms of the eye, convergence impacts the way images presented on a screen are mapped onto the retinas. Convergence micropsia can be consciously perceived, particularly for strong negative disparities/strong convergence, but is likely also present to a lesser degree at lower convergence levels.

Ocular Media Index: n′≈4/3

Snell/Descartes law applies when light rays cross the cornea surface. Ocular media have a refractive index n′ close to a value of 4/3. This makes the eye a “thick lens” where the projection approximations usually used with thin lenses (central projection through the optical center) do not apply. A ‘single point optical center + symmetric focal points’ model has to be replaced at least by a six cardinal points model (i.e., 2 focal points + 2 principal points + 2 nodal points). Furthermore, as the ocular media refractive index is not close to 1, ray directions inside the eye do not match ray directions of light incident from the scene. From inside the eye, monocular distances to points in the scene appear divided by n′. Binocular perception is modified, creating perspective distortion dependent on convergence.

Accommodation

Accommodation modifies the power of the eye by modifying the eye lens curvatures. This affects the ray path to the retina, basically generating a scaling of the projected image (see “Johnston, E. B. ‘Systematic distortions of shape from stereopsis’, Vision Research, 31, 1351-1360 (1991)”). Angular distortion may as well appear when accommodating, the deviation increasing with the angular distance of a scene point to the optical axis. The inventors observed in a controlled experiment that objects displayed with uncrossed disparities appeared flattened while objects displayed with crossed disparities were perceived elongated in depth. On the basis of perceptual estimation, it is possible to rearrange the disparity map to correct the flattening/elongation distortion (see “C. Vienne, P. Mamassian and L. Blonde, ‘Perception of stereo at different vergence distances: Implications for realism’, submitted to the International Conference on 3D Imaging (IC3D), December 2012”).

Prescription Glasses

Prescription glasses also modify the ray path between the observed scene and the retina (see “Alonso, J., & Alda, J. (2003). Ophthalmic optics. Encyclopedia of Optical Engineering, 1563-1576”). Up- or down-scaling, depending on the prescription power, as well as distortion, may occur. Another effect is linked to convergence, with a prismatic effect modifying the chief ray of the observed point of interest beam.

Visual System Scaling—Size/Depth Constancy

Non-optical scaling has been identified, linked to perceptual constancy and originating in the post-retinal visual system (see “Murray, S. O., Boyaci, H., & Kersten, D. ‘The representation of perceived angular size in human primary visual cortex’. Nature Neuroscience, 9 (3), 429-434. (2006)”). Perceptual constancy refers to the fact that our perception of objects is relatively constant despite the fact that the size of objects on the retina varies greatly with distance. It reflects a tendency to perceive stimuli as represented in the environment rather than as represented on the retina. Although the source of this effect is still discussed amongst authors, it significantly affects size and depth perception depending on an estimate of the distance between the observed object and the observer. This estimate of viewing distance presumably relies on the actual vergence and accommodative states (see “Watt, S. J., Akeley, K., Ernst, M. O., & Banks, M. S. ‘Focus cues affect perceived depth’. Journal of Vision, 5 (10). (2005)” and “Mon-Williams, M., & Tresilian, J. R. ‘Ordinal depth information from accommodation?’. Ergonomics, 43 (3), 391-404. (2000)”), and on the size ratios of horizontal and vertical disparities (see “Rogers, B. J., & Bradshaw, M. F. ‘Disparity scaling and the perception of frontoparallel surfaces’. Perception-London, 24 (1), 155-180. (1995)”), with distance dependence.

Off-Axis Vision:

While a number of physiological parameters of human vision have thus been described, here is explored the aspect of off-axis vision, with the prospect that it contributes strongly to the feeling of immersion when a rendered scene corresponds correctly to reality in the non-axial field of view. Indeed, when the two eyes converge at a fixation point (see FIG. 5), there is a range of disparities, known as Panum's fusional area, where the two images, one for each eye, generate the percept of a single object, appearing with perceived depth. Outside this area, observers experience diplopia (double images), leading sometimes to binocular rivalry, or more often to binocular suppression (see “Howard, I. P., & Rogers, B. J. (1995). Binocular vision and stereopsis. Oxford University Press, USA”). As of today, when observing a scene on a 3D screen, it can be noticed that only a restricted field of view generates depth perception. More than a few degrees away from the fixation point, depth is no longer perceived and lateral parts of the scene look flat. If gaze orientation changes towards them, these regions will be seen with depth, as they become the ‘on-axis’ regions. On the other hand, in natural vision, the field of depth perception is larger, corresponding to an acquired capacity of the visual system to perceive depth at larger field angles. This capacity is modified, for example, when wearing glasses with a modified prescription or when switching between prescription glasses and contact lenses. It takes some time for the visual system to re-adapt to a new angular projection of the surrounding world on the retinas. Fusion between the left eye view and the right eye view has to be reorganized, which is possible thanks to the brain's plastic capacities (plasticity).

While the classical 3D model considers only one point in space (i.e. one pixel, a fraction of a degree) and the two pupil centers to compute disparities to be displayed, binocular perception exists in a much wider field of view: the angle subtended by the foveola of each eye is about 1 degree and the angle for the fovea is about 6 degrees, without considering parafoveal perception, which is also capable of binocular fusion although with less precision. The goal of an extended 3D projection model according to the present embodiment is to widen the field of view where depth is perceived.

FIGS. 6A, 6B and 6C (from FIG. 2.7 of “Howard, I. P., & Rogers, B. J. (1995). Binocular vision and stereopsis. Oxford University Press, USA”) present the notion of disparity gradient G as being the relative angular disparity η between the images of two objects divided by their angular separation D: G=η/D. The authors report that “two dichoptic (viewing a separate and independent field by each eye, in binocular vision, as for example in a haploscope, or more generally a stereoscopic display) images do not fuse when the disparity gradient with respect to a neighboring fused pair of images exceeds a value of about 1”. FIG. 6A shows a state of two objects with a disparity gradient of less than 2; FIG. 6B shows a state where two objects on a visual line of one eye have a disparity gradient of 2 (the visual line may, or may not, be a visual axis); and FIG. 6C shows a state where two objects on a hyperbola of Hillebrand have a disparity gradient approaching infinity. The disparity gradient is the angular disparity η between the images of two objects divided by their angular separation D. The separation is the angle between the mean direction of the images of one object and the mean direction of the images of the other object. In this case, one object is a fixated black dot and the other object is a circle producing disparity η, with a mean direction indicated by vertical dotted lines.

Considering again the perspective projection model in the ‘standard’ 3D observer model, it is used independently for the projection of each point in space, considering in each case an on-axis observation for each eye. This model does not consider a neighborhood of points, with disparity gradients, that are simultaneously seen by a larger field on the retina, including off-axis regions. In this case, when the field angle increases, the limit on disparity gradient can be broken and depth perception disappears. In that case, non-linearities in the human eye projection need to be taken into account for rendering the 3D stimuli correctly or adapting 3D views.

In other words, while a simple projection model may be valid for on-axis projection (paraxial or Gaussian approximation), off-axis points, in natural viewing, are likely not seen in the directions predicted by a double perspective projection model. These off-axis points will be presented in a ‘transformed’ direction for each eye, so that the 3D stimuli presented to the observer correspond to his/her natural vision. This process is observer dependent, according to the parameters described above.

As shown in FIG. 7, an internal optical system deviation/scaling will be taken into account. Thus, the 3D stimuli will be transformed to present a point M′ to generate a perceived point M, correctly fused and at the right spatial location. In fact, when attention is on the point A, the directions from the point M are difficult or impossible to fuse while the directions from the point M′ are correctly fused. In FIG. 7, Ltr denotes the left eye transform while Rtr denotes the right eye transform.

Based on the above description, according to the present embodiment, a spatial (angular domain) transform is applied to a 3D presented content, dependent on the fixation point and centered on the visual axes, in order to take into account the various optical and non-optical effects in human vision with the goal to enlarge the field of fused 3D.

As mentioned above, the process in the model according to the present embodiment starts from the identification that 3D perception is based first on monocular retinal images. For each individual observer, binocular fusion is trained/adapted to his/her daily life environment. The left and right monocular images, in order to be fused comfortably to create a 3D percept, have to respect, or be closer to, the geometry of natural life observation, for both still scenes and scenes in motion.

The process of the model according to the present embodiment aims to transform stereoscopic views initially created for a ‘standard’ 3D observer as mentioned above.

It is noted that another model, based on the same principles, can be derived for a CGI pipeline in the form of a transformed camera model.

As described above using FIG. 3, the process of the model according to the present embodiment is the independent processing of the left and right views with the goal to re-align pixel directions for each eye, starting from ‘standard’ 3D observer images. After a first step adjusting disparity for the fixation point to be at the intended distance, each view (left and right) is independently processed, correcting first a potential keystoning (and astigmatism) before applying a radial transform. Basically, the radial transform is a central scaling (centered on the visual axis) while higher order coefficients represent optical distortions.

Next, a device 100 for 3D rendering according to the present embodiment will be described.

As shown in FIG. 8, the device 100 for 3D rendering according to the present embodiment includes a Central Processing Unit (CPU) 110, a Random Access Memory (RAM) 120, a Read-Only Memory (ROM) 130, a storage device 140, an input device 150 and an output device 160, which are connected via a bus 180 in such a manner that they can carry out communication thereamong.

The CPU 110 controls the entirety of the device 100 by executing a program loaded in the RAM 120. The CPU 110 also performs various functions by executing a program(s) (or an application(s)) loaded in the RAM 120.

The RAM 120 stores various sorts of data and/or a program(s).

The ROM 130 also stores various sorts of data and/or a program(s).

The storage device 140, such as a hard disk drive, an SD card, a USB memory and so forth, also stores various sorts of data and/or a program(s).

The input device 150 includes a keyboard, a mouse and/or the like for a user of the device 100 to input data and/or instructions to the device 100.

The output device 160 includes a display device or the like for showing information such as a processed result to the user of the device 100.

The device 100 performs the 3D rendering described above using FIG. 3 and the like, as a result of the CPU 110 executing instructions written in a program(s) loaded in the RAM 120, the program(s) being read out from the ROM 130 or the storage device 140 and loaded into the RAM 120.

[Example of Transform Taking Into Account Eye Parameters and Accommodative Scaling]

Next, an example of a transform taking into account eye parameters and accommodative scaling will be described.

First, n′ is defined as an eye refractive index average or approximation (actual eye, or average eye model).

The model parameters are defined as follows:

- n′=4/3
- f₀=50/3 (=16.666 . . . mm default)
- N₀N′₀=P₀P′₀ (=0.3 mm default)
- L, R: eye positions in space
- A: attention point position in space
- Kappa angle: angle between the visual and optical axes
- Screen geometric characteristics

Scene points in space are input.

3D or multiview images on the screen are output.

The above-mentioned six cardinal points eye model (unequifocal system) is illustrated in FIGS. 9A, 9B and 9C (according to “http://www.telescopeoptics.net/eye_aberrations.htm”).

The schematic human eye of FIG. 9A shows average values for the six cardinal points eye model.

As shown in FIG. 9B, image formation is such that the image of a point “object” at the height h in the outer field is determined by two rays originating at the object point, one normal to the 2nd principal plane (normal to the axis at the 2nd principal point P′), and the other passing through the anterior focal point and turning parallel with the axis at the first principal plane (normal to the axis at the first principal point P). The significance of nodal points is that the ray from the object point directed toward the 1st nodal point (N) acts as if it follows the axis to the 2nd nodal point (N′), from which it connects to the image point keeping the original ray orientation. In other words, nodal points define the actual incident angle, i.e. the actual field of view. The apparent incident angle, and field of view, determined by the ray passing through the center of the exit pupil (i.e. intersecting the axis at a point between nodal and principal points), the chief ray, is smaller than the actual angle and field; for the schematic eye, by a constant factor of about 0.82.

As shown in FIG. 9C, the “reduced eye” is a simple representation of the eye as an unequifocal system (different focal lengths in the object and image spaces due to a different refractive index).

An observing position is defined by a viewing eye position in space. It can be a left or right eye in stereoscopy, or a position determined by one of the multiple views of an autostereoscopic display.

For each view, the following processes (1), (2), (3) and (4) are carried out:

(1) From the attention point A position in space and an observing position O_P in space, a Visual axis D_V in space, i.e. the line passing through A and O_P, is determined.

(2) If the Optical axis is not approximated by the Visual axis D_V, an Optical axis D_O in space is determined (using the Kappa angle).

(3) From the attention point A position in space, an accommodative focal distance f is computed, depending on an initial focal distance f₀ and on the distance PA_P (=f_acc, the focal length of a complementary lens corresponding to accommodation), where P is the principal point of the eye model and A_P is the orthogonal projection of A on the Optical axis D_O.

The thin lens formula

1/f_acc = 1/f₀ − 1/f = 1/PA_P

gives

f = 1/(1/f₀ − 1/PA_P).

(4) From f and f₀, a magnification factor m=f₀/f is computed.
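
A minimal sketch of processes (3) and (4), directly transcribing the two relations above (distances in millimetres, f₀=50/3 mm by default):

// f = 1 / (1/f0 - 1/PA_P): accommodative focal distance (process (3)).
double accommodativeFocal(double f0, double PAp) {
    return 1.0 / (1.0 / f0 - 1.0 / PAp);
}

// m = f0 / f: magnification factor (process (4)).
double magnification(double f0, double f) {
    return f0 / f;
}

// Example: f0 = 50.0/3.0 mm and PA_P = 1000 mm give f ≈ 16.95 mm
// and m ≈ 0.983.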

For each scene point M in space (in absolute scene coordinates), the following process is carried out:

an Intended direction D_I is determined for each scene point M considering the Observing position N′ nodal point (N′ is obtained by translating N on the Optical axis D_O by an observer dependent NN′ distance);

knowing the Optical axis D_O, the Observing position N′ nodal point position, and the Intended direction D_I, a Required direction D_R passing through the eye N nodal point position is computed;

a (L or R) image point J_M is determined as the intersection of the Required direction D_R and the (L or R) imaging plane; and

to obtain a final projected point I_M, the magnification factor 1/m is applied to the image point J_M in the (L or R) imaging plane, with respect to the centre of the magnification I_O, where I_O is the intersection of the imaging plane and the Optical axis D_O (so that the vector I_O I_M = (I_O J_M)/m).
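
This last step can be sketched as a simple 2D operation in the imaging plane (Point2 is a hypothetical helper type):

struct Point2 { double x, y; };

// Scale the projected point Jm about the center Io by 1/m,
// per the relation I_O->I_M = (I_O->J_M) / m.
Point2 applyMagnification(Point2 Jm, Point2 Io, double m) {
    return { Io.x + (Jm.x - Io.x) / m,
             Io.y + (Jm.y - Io.y) / m };
}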

For stereo, the process is repeated for the left and right views independently. For multiple view displays, the process can be repeated for each view independently.

For stereoscopy, the scene can then be observed binocularly, the left eye being presented the left image and the right eye being presented the right image.

Similarly, for autostereoscopy, the scene can then be observed binocularly from the screen sweet-spot points of view.

Note that the specific example described above as [Example of Transform Taking Into Account Eye Parameters and Accommodative Scaling] relates to the process described above using the flowchart of FIG. 3 and concerns the derivation of the scaling factor in “Transform centered on <<C>>” (Steps S6-S8) of FIG. 3. The scaling factor (magnification factor) is m=f₀/f and the center of magnification I₀ is the same point as the point <<C>> in FIG. 3.

The derivation of “m” in this example is only an example based on the use of the six cardinal points eye model described above. The influence of corrective glasses magnification is not taken into account here.

Thus, the device, method and computer program for 3D rendering have been described in the specific embodiment. However, the present disclosure is not limited to the embodiment, and variations and replacements can be made within the scope of the claimed disclosure.

As will be appreciated by one skilled in the art, aspects of the present principles can be embodied as a system, method or computer readable medium. Accordingly, aspects of the present principles can take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, and so forth), or an embodiment combining software and hardware aspects that can all generally be referred to herein as a “circuit,” “module”, or “system.” Furthermore, aspects of the present principles can take the form of a computer readable storage medium. Any combination of one or more computer readable storage medium(s) may be utilized.

A computer readable storage medium can take the form of a computer readable program product embodied in one or more computer readable medium(s) and having computer readable program code embodied thereon that is executable by a computer. A computer readable storage medium as used herein is considered a non-transitory storage medium given the inherent capability to store the information therein as well as the inherent capability to provide retrieval of the information therefrom. A computer readable storage medium can be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. It is to be appreciated that the following, while providing more specific examples of computer readable storage mediums to which the present principles can be applied, is merely an illustrative and not exhaustive listing as is readily appreciated by one of ordinary skill in the art: a portable computer diskette; a hard disk; a read-only memory (ROM); an erasable programmable read-only memory (EPROM or Flash memory); a portable compact disc read-only memory (CD-ROM); an optical storage device; a magnetic storage device; or any suitable combination of the foregoing.

Thus, for example, it will be appreciated by those skilled in the art that the block diagrams presented herein represent conceptual views of illustrative system components and/or circuitry embodying the principles of the invention. Similarly, it will be appreciated that any flowcharts, flow diagrams, state transition diagrams, pseudo code, and the like represent various processes which may be substantially represented in computer readable storage media and so executed by a computer or processor, whether or not such computer or processor is explicitly shown.

The invention claimed is:
1. A method of rendering a 3D scene on an imaging plane for a viewer using a 3D rendering device configured to render a stereo pair, the method comprising: determining a first intersection point of a fixation axis of a left eye with an imaging plane, carrying out a first keystone transform of a left view to a plane orthogonal to the fixation axis of said left eye resulting in a keystone transformed left view, carrying out a radial transform of said keystone transformed left view with said first intersection point as a center resulting in a radial transformed left view, and carrying out an inverse keystone transform of said radial transformed left view resulting in a processed left view, wherein said inverse keystone transform is defined as the inverse of said first keystone transform, determining a second intersection point of the fixation axis of a right eye with the imaging plane, carrying out a second keystone transform of a right view in a plane orthogonal to the fixation axis of said right eye resulting in a keystone transformed right view, carrying out a radial transform of said keystone transformed right view with said second intersection point as a center resulting in a radial transformed right view, carrying out an inverse keystone transform of said radial transformed right view resulting in a processed right view, wherein said inverse keystone transform is defined as the inverse of said second keystone transform, wherein keystone and radial transforms are performed utilizing interpolated parameters, said interpolated parameters being an interpolation between parameters estimated for a plurality of points in a 3D space, and rendering said stereo pair comprising said processed left view and said processed right view.
2. Method according to claim 1, wherein the radial transform for the left eye is performed based on at least one left eye radial scale factor sL1, sL2, sL3, sLn of different orders applied to a radial distance “rL” to said first intersection point on the said plane orthogonal to the fixation axis of said left eye, such that the transformed radial distance rL′=sL1·rL+sL2·(rL)²+sL3·(rL)³+sLn·(rL)ⁿ, and wherein the radial transform for the right eye is performed based on at least one right eye radial scale factor sR1, sR2, sR3, sRn of different orders applied to a radial distance “rR” to said second intersection point on the said plane orthogonal to the fixation axis of said right eye, such that the transformed radial distance rR′=sR1·rR+sR2·(rR)²+sR3·(rR)³+sRn·(rR)ⁿ.
 3. Method according to claim 2, wherein the respective radial scale factors of the left eye and right eye are determined by an interactive process by the observer, wherein the interactive process comprises scaling a rendered shape horizontally and vertically by comparing the rendered shape with a shape of a physical object placed at an intended fixation point.
4. Method according to claim 2, wherein said at least one left eye radial scale factor sL1, sL2, sL3, sLn and said at least one right eye radial scale factor sR1, sR2, sR3, sRn are dependent on said observer's physiological parameters respectively for his/her left eye and for his/her right eye.
5. Method according to claim 4, wherein said observer's physiological parameters comprise at least one of Inter-pupillar distance, Convergence, Ocular media index, Accommodation, Prescription glasses, and Visual system scaling of said observer.
6. A device for 3D rendering a 3D scene for a viewer, the device comprising a memory associated with at least one processor configured for: determining a first intersection point of a fixation axis of a left eye with an imaging plane, carrying out a first keystone transform of a left view to a plane orthogonal to the fixation axis of said left eye resulting in a keystone transformed left view, carrying out a radial transform of said keystone transformed left view with said first intersection point as a center resulting in a radial transformed left view, carrying out an inverse keystone transform of said radial transformed left view resulting in a processed left view, wherein said inverse keystone transform is defined as the inverse of said first keystone transform, determining a second intersection point of the fixation axis of a right eye with the imaging plane, carrying out a second keystone transform of a right view in a plane orthogonal to the fixation axis of said right eye resulting in a keystone transformed right view, carrying out a radial transform of said keystone transformed right view with said second intersection point as a center resulting in a radial transformed right view, and carrying out an inverse keystone transform of said radial transformed right view resulting in a processed right view, wherein said inverse keystone transform is defined as the inverse of said second keystone transform, wherein keystone and radial transforms are performed utilizing interpolated parameters, said interpolated parameters being an interpolation between parameters estimated for a plurality of points in a 3D space, and rendering a stereo pair comprising said processed left view and said processed right view.
7. Device according to claim 6, wherein the radial transform for the left eye is performed based on at least one left eye radial scale factor sL1, sL2, sL3, sLn of different orders applied to a radial distance “rL” to said first intersection point on the said plane orthogonal to the fixation axis of said left eye, such that the transformed radial distance rL′=sL1·rL+sL2·(rL)²+sL3·(rL)³+sLn·(rL)ⁿ, and wherein the radial transform for the right eye is performed based on at least one right eye radial scale factor sR1, sR2, sR3, sRn of different orders applied to a radial distance “rR” to said second intersection point on the said plane orthogonal to the fixation axis of said right eye, such that the transformed radial distance rR′=sR1·rR+sR2·(rR)²+sR3·(rR)³+sRn·(rR)ⁿ.
8. Device according to claim 7, wherein the respective radial scale factors of the left eye and right eye are determined by an interactive process by an observer, wherein the interactive process comprises scaling a rendered shape horizontally and vertically by comparing the rendered shape with a shape of a physical object.
9. Device according to claim 6, wherein said at least one left eye radial scale factor sL1, sL2, sL3, sLn and said at least one right eye radial scale factor sR1, sR2, sR3, sRn are dependent on the observer's physiological parameters respectively for his/her left eye and for his/her right eye.
10. Device according to claim 9, wherein said observer's physiological parameters comprise at least one of Inter-pupillar distance, Convergence, Ocular media index, Accommodation, Prescription glasses, and Visual system scaling of said observer.
 11. Computer program product downloadable from a communication network or recorded on a medium readable by a computer and executable by a processor, comprising program code instructions for implementing the steps of a method according to claim 1.
 12. Non-transitory computer-readable medium comprising a computer program product recorded thereon and capable of being run by a processor, including program code instructions for implementing the steps of a method according to claim 1.